Coding of image sequences with a plurality of image blocks and reference images

ABSTRACT

A coding method codes a sequence of digitized images with a plurality of macro blocks in error-prone networks in which case the macro blocks in a section of the image are coded in a first intra-coding mode depending on predetermined criteria. In addition, the macro blocks in a section of the image are coded in a second intra-coding mode or in an inter-coding mode in which case in the inter-coding mode for the macro blocks, movement vectors are selected from the number of accessible reference images. The selection from the number of accessible reference images is limited in such a way that referencing takes place from image areas that were not subjected to the first intra-coding mode at a later stage.

BACKGROUND

The invention relates to a method for coding a sequence of digitized images with a plurality of image blocks as well as a corresponding decoding method. The invention also relates to corresponding coding and decoding devices.

Current video coding standards (for example, see document [1] below) allow the coding of image sequences wherein macro image blocks used for an estimation of movement are updated by means of an intra-coding mode. As a result, errors are not propagated in the image sequence. Updating by means of intra-coding modes can be carried out at regular intervals or based on predetermined criteria. For other video coding methods, inter-coding modes that refer back to several previously coded reference images can be used. However, there are no mechanisms that allow efficient video coding with inter-coding modes and intra-updating modes over error-prone networks.

The publication “Proc. Intl. Conf. on Image Processing ICIP, Lausanne, vol. 1, 16.09.1996, pp. 763-766 (Lio et al)” describes an intra-update method for video coding via channels prone to errors. This method analyses the specific sensitivity of macro blocks to channel errors and obtains a specification for the intra-update modes.

The publication “Proc. IEEE ICASSP, San Francisco, vol. 5, 23.3.1992, pp. 545-548 (Haskell et al)” describes several possible methods for resynchronizing movement-compensated videos that are adversely affected by ATM cell loss.

Accordingly, a system and method are needed that allow efficient video coding with inter-coding modes and intra-updating modes over error-prone networks. As will be discussed below, a method is disclosed for coding a sequence of digitized images that uses a plurality of intra-coding and inter-coding modes as well as a plurality of reference images to ensure a reliable reconstruction of the digitized images in error-prone networks.

SUMMARY OF THE INVENTION

One exemplary embodiment codes a sequence of digitized images with a plurality of image blocks in error-prone networks, wherein the macro blocks in a section of the image are coded in a first intra-coding mode depending on predetermined criteria. Furthermore, the macro blocks in a section of the image are coded in a second intra-coding mode or in an inter-coding mode, wherein in the inter-coding mode, movement vectors for the macro blocks are selected from the number of accessible reference images. The selection from the number of accessible reference images is limited in such a way that referencing takes place from image areas that were not subjected to the first intra-coding mode at a later stage. This helps to prevent the inter-coding mode from referencing reference image areas that are subjected at least partially to an intra-coding mode. If the coding in the first intra-coding mode is carried out particularly for reasons of error robustness, in order to avoid the propagation of errors in the case of incorrect transmissions, this ensures that the coding is not based on image areas that were transmitted incorrectly. Therefore, a coding that is efficient and at the same time error robust is provided in error-prone networks.

Under the exemplary embodiment discussed above, the coding is carried out in a first intra-coding mode at regular time intervals. Alternatively, the coding in the first intra-coding mode can be repeated at random time intervals.

Under another exemplary embodiment, the coding is carried out in a second intra-coding mode or in an inter-coding mode for reasons of coding efficiency. For reasons of coding efficiency, an intra-coding mode is particularly taken into consideration if an object in the image sequence only appears temporarily in some images.

Under yet another exemplary embodiment, the following steps are carried out to limit the reference images for coding a macro block. For each inter-coding mode from the set of possible inter-coding modes and for each reference image from the set of reference images that can be accessed, optimized movement vectors are selected from the set of possible movement vectors by rate distortion optimized movement compensation. From the complete set of possible combinations of inter-coding modes and reference images, a limited set is created by removing the combinations whose referenced image areas were coded in a later image in a first intra-coding mode. Based on the limited set and a set of intra-coding modes, the best combination according to rate distortion criteria is determined. When the image block was coded with an intra-coding mode in the preceding step, it is established in an additional step whether the image block was intra-coded on the basis of error robustness criteria (first intra-coding mode) or on the basis of the rate distortion optimization (second intra-coding mode). Therefore, an optimum coding mode can be determined for the macro blocks to be coded. The use of rate distortion criteria is described in greater detail in documents [3] and [4] below.

In one variant, the rate distortion criteria for determining the best combination depend on the error rate to be expected when transmitting the coded images. In this case, the distortion of the pixel values of the images is calculated in order to determine these criteria. The distortion of the pixel values preferably contains the sum of the quadratic differences between the pixel values before coding and the correspondingly decoded pixel values. Because the distortion is usually not known when coding, the distortion is estimated in a particularly preferred embodiment.

In addition to the above-described coding method, a corresponding method is also disclosed for decoding a sequence of digitized images in error-prone networks, in which case the method is embodied in such a way that a sequence of digitized images coded with the coding method described above is decoded. Under an example discussed in detail below, an error concealment is used for decoding.

A device for coding a sequence of digitized images in error-prone networks is also disclosed, in which case the device is embodied in such a way that the coding method described above can be carried out. The invention also includes a corresponding device for decoding digitized images in error-prone networks, in which case the device is embodied in such a way that the decoding method described above can be carried out.

BRIEF DESCRIPTION OF THE DRAWINGS

The invention and its wide variety of potential embodiments will be more readily understood through the following detailed description, with reference to the accompanying drawings, in which:

FIG. 1 illustrates a section of a sequence of decoded images wherein the images were previously coded with a method according to the prior art; and

FIG. 2 illustrates a section of a sequence of decoded images wherein the images were previously coded with the method according to the invention.

DETAILED DESCRIPTION

The image sequence shown in FIG. 1 is coded with the encoder described in document [1], wherein this encoder performs intra-updating in an intra-coding mode at regular intervals in order to prevent errors from being propagated in the decoder in the case of an incorrect transmission of the image sequence. The intra-updating modes correspond to coding in a first intra-coding mode.

The image sequence is transmitted via an Internet test pattern that is described in document [2]. In this case, the image sequence is transmitted in data packets wherein a data packet consists of two rows of image blocks. In the text below, image blocks are referred to as macro image blocks whose shifting in the case of the inter-coding mode is determined by means of movement vectors. The coding method by means of which the image sequence shown in FIG. 1 was coded also includes a second intra-coding mode and an inter-coding mode. In the inter-coding mode, an estimation of movement with respect to a maximum of five reference image blocks is carried out.

The section of the image sequence shown in FIG. 1 comprises images No. 9 to No. 12 of this sequence. In order to improve the display of the image sequence, an error concealment by means of gray tones was also used. In this example, one packet was lost in the first image of the sequence when the image sequence was transmitted. This transmission error is still visible in image No. 9 of the sequence, as can be seen from the horizontal lines in image No. 9 of FIG. 1. In image No. 10 of FIG. 1, an intra-updating mode is carried out for a section of the image blocks, so that a section of the incorrect image area has disappeared in image No. 10. In image No. 11, an inter-coding mode was carried out by means of reference images that temporally precede image No. 10 and, therefore, do not include the intra-updating mode. Thus, a large part of the incorrect area appears again in image No. 11 of FIG. 1. The same phenomenon appears again in image No. 12. As a result of this phenomenon, not only is the distortion in the image increased objectively, but the effect in the image is also found to be very disturbing from a subjective viewpoint.

The above-described image disturbances can be ascribed to the fact that, for the coding used in the sequence of FIG. 1, a first intra-coding mode is combined with an inter-coding mode that uses multiple reference images. The occurrence of these disturbances could be avoided by not resorting to multiple reference images in the case of incorrect transmissions, but this would considerably reduce the compression efficiency.

In order to avoid the above-described disturbances to the greatest possible extent, the coding method according to the invention limits the reference images such that, for the inter-coding mode, only those reference image blocks are used that are not subjected to any intra-updating mode after the reference image has been coded. The results of the method are shown in FIG. 2. FIG. 2 illustrates the same image sequence as in FIG. 1, with the difference that the coding method according to the invention was used. From the images in FIG. 2, it is clear that the image disturbances in images No. 11 and No. 12 have disappeared. This is due to the fact that, for the inter-coding mode, no reference images that were transmitted incorrectly to the decoder are used. The increase in the bit rate that results from the method according to the invention is relatively moderate and lies at about 5%.

Exemplary embodiments of the method according to the invention are described in greater detail below. In one embodiment of the method, for each macro block coding mode m from the set of possible inter-coding modes M_(p) and for each reference image r from the set of accessible reference images R, an optimum movement vector v(m, r) is selected from the set of movement vectors V(m) for the movement compensation. The selection takes place according to rate distortion criteria. Mathematically, the rate distortion criteria are expressed as follows:

$$v(m,r) = \underset{v \in V(m)}{\arg\min}\,\bigl(D_{DFD}(m,r,v) + \lambda_{Motion}\,R_{Motion}(m,r,v)\bigr) \qquad (1)$$

where D_(DFD)(m, r, v) is the distortion after the movement compensation and R_(Motion)(m, r, v) is the number of bits that are needed for coding the specific movement vector. The function D_(DFD)(m, r, v) + λ_(Motion)R_(Motion)(m, r, v) is a so-called Lagrange cost function that contains the Lagrange multiplier λ_(Motion). This function is minimized, whereby movement vectors are determined that are optimum with regard to the distortion and the memory space requirement for the movement vector. Therefore, as a first result, optimized movement vectors v(m, r) are obtained for each reference image r and for each macro block coding mode m.
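The minimization in equation (1) can be illustrated with a minimal Python sketch. The function name `select_motion_vector` and the toy distortion and rate tables are assumptions made for illustration only; a real encoder would compute the distortion and the bit count from the image data for the given mode m and reference image r.

```python
from typing import Callable, Iterable, Tuple

Vector = Tuple[int, int]


def select_motion_vector(
    candidates: Iterable[Vector],
    distortion: Callable[[Vector], float],
    rate: Callable[[Vector], float],
    lam: float,
) -> Vector:
    """Return the candidate v that minimizes D_DFD(v) + lam * R_Motion(v).

    The mode m and the reference image r are assumed to be fixed by the
    caller, so distortion and rate only depend on the candidate vector.
    """
    return min(candidates, key=lambda v: distortion(v) + lam * rate(v))


# Hypothetical values for one (m, r) combination.
toy_distortion = {(0, 0): 100.0, (1, 0): 60.0, (2, 1): 50.0}
toy_rate = {(0, 0): 1, (1, 0): 3, (2, 1): 6}  # bits per vector

best = select_motion_vector(
    toy_distortion, toy_distortion.get, toy_rate.get, lam=5.0
)
```

With these toy values, a small Lagrange multiplier favors the vector with the lowest distortion, while a large multiplier favors the vector that is cheapest to code.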

In a next step, the set of combinations of inter-coding modes M_(p) and reference images R is limited by removing those combinations in which referencing takes place from image areas that are subjected to an intra-updating mode at a later stage, for example, for reasons of error robustness. In this way, a set O_(p) of possible values m and r is obtained for the movement vectors, as follows:

$$O_p = \{(m,r) \in \{M_p, R\} \mid s_{\min f_i}(v(m,r), f, k) \geq r\} \qquad (2)$$

where

-   k = 1, . . . , K is the number of an image block;
-   f is the vector {f₁, . . . , f_(K)}, in which the variable f_(i) gives the number of the reference image in which the last intra-updating mode was carried out for the i-th image block; and
-   s_(min f_i)(v(m, r), f, k) is an operation that determines, for the image block k and depending on v(m, r) and f, the number of the reference image that is the last permitted reference image on the basis of the reference image limitation.

If the number of the last permitted reference image is greater than or equal to the number of the reference image r, then the combination (m, r) has a reference image that lies within the set of reference images limited by the method according to the invention. If the number of the last permitted reference image is less than the number of the reference image r, then the corresponding combination (m, r) is rejected.
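The reference image limitation of equation (2) can be sketched as follows. This is only one possible reading, under several assumptions that are not stated in the text: reference images are numbered backwards in time (r = 1 is the previously coded image), f_i holds how many images back the last intra-updating of block i took place (infinity if there was none), macro blocks are 16x16 pixels, and the operation s is taken to be the minimum of f_i over all macro blocks overlapped by the displaced block. All function names are hypothetical.

```python
import math
from typing import Dict, Sequence, Set, Tuple

MB = 16  # assumed macro block size in pixels

Vector = Tuple[int, int]
Combo = Tuple[str, int]  # (inter-coding mode m, reference image r)


def covered_blocks(k: int, v: Vector, per_row: int, per_col: int) -> Set[int]:
    """Indices of the macro blocks overlapped by block k displaced by v."""
    bx, by = k % per_row, k // per_row
    x0, y0 = bx * MB + v[0], by * MB + v[1]
    cols = {x0 // MB, (x0 + MB - 1) // MB}
    rows = {y0 // MB, (y0 + MB - 1) // MB}
    return {
        row * per_row + col
        for row in rows
        for col in cols
        if 0 <= col < per_row and 0 <= row < per_col
    }


def last_permitted_reference(
    v: Vector, f: Sequence[float], k: int, per_row: int, per_col: int
) -> float:
    """Assumed form of s: the smallest f_i over the referenced macro blocks."""
    touched = covered_blocks(k, v, per_row, per_col)
    return min((f[i] for i in touched), default=math.inf)


def restrict_combinations(
    best_vectors: Dict[Combo, Vector],
    f: Sequence[float],
    k: int,
    per_row: int,
    per_col: int,
) -> Set[Combo]:
    """Build O_p: keep (m, r) only if s(v(m, r), f, k) >= r."""
    return {
        (m, r)
        for (m, r), v in best_vectors.items()
        if last_permitted_reference(v, f, k, per_row, per_col) >= r
    }
```

For example, on a 2x2 grid of macro blocks in which only block 1 was intra-updated two images back, a vector for block 0 that reaches into block 1 is permitted for r = 1 and r = 2 but rejected for r = 3, because the image three back precedes the intra-updating of block 1.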

The limited set O_(p) of reference images and inter-coding modes m resulting from the previous step is combined with a set of intra-coding modes M_(I) that can be used in the method according to the invention, and the optimized coding mode o(k) is again determined from the union O = {M_(I), O_(p)} for each macro block k by means of the rate distortion criteria. If this macro block is intra-coded of necessity, for example, by regular or random intra-updating modes, then the set O is limited to intra-modes only, that is, O = M_(I). Of course, in this case it is also not necessary to determine O_(p). Mathematically, the rate distortion criteria can again be formulated as the minimization of a Lagrange cost function:

$$o(k) = \underset{o \in O}{\arg\min}\,\bigl(D(o) + \lambda_{Mode}\,R(o)\bigr) \qquad (3)$$

where R(o) is the number of bits needed to code the image block in the coding mode o, and D(o) represents the distortion for this coding mode.
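The mode decision of equation (3) can be sketched in Python as a minimization over the union of the intra-coding modes and the restricted inter combinations. The function name `choose_mode`, the `forced_intra` parameter, and the toy cost tables are assumptions for illustration; the text does not prescribe any particular implementation.

```python
from typing import Callable, Hashable, Iterable


def choose_mode(
    intra_modes: Iterable[Hashable],
    restricted_inter: Iterable[Hashable],
    distortion: Callable[[Hashable], float],
    rate: Callable[[Hashable], float],
    lam: float,
    forced_intra: bool = False,
) -> Hashable:
    """Return o(k), the candidate that minimizes D(o) + lam * R(o).

    If the block must be intra-updated (regular or random updating),
    the candidate set is limited to the intra-coding modes only.
    """
    candidates = list(intra_modes)
    if not forced_intra:
        candidates += list(restricted_inter)
    return min(candidates, key=lambda o: distortion(o) + lam * rate(o))


# Hypothetical costs for one macro block.
D = {"I4": 20.0, "I16": 30.0, ("P16", 1): 25.0}
R = {"I4": 40, "I16": 25, ("P16", 1): 8}

free_choice = choose_mode(["I4", "I16"], [("P16", 1)], D.get, R.get, 1.0)
forced = choose_mode(
    ["I4", "I16"], [("P16", 1)], D.get, R.get, 1.0, forced_intra=True
)
```

In this toy case the inter combination wins when the choice is free, and the cheaper of the two intra-coding modes is selected when intra-updating is forced.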

If there is a regular or random intra-updating mode in the example above, the distortion is obtained as the sum of the quadratic differences between the original image block and the image block received after the decoding. If the intra-updating mode is to be carried out on the basis of an error-optimized channel adaptive coding described further below, the distortion is given as the expected value of the distortion in the decoder.
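The sum of quadratic differences mentioned above can be written as a short Python helper. The name `sum_of_squared_differences` and the representation of an image block as a flat sequence of pixel values are assumptions for illustration.

```python
from typing import Sequence


def sum_of_squared_differences(
    original: Sequence[float], decoded: Sequence[float]
) -> float:
    """Distortion of a block: sum over all pixels of (original - decoded)^2."""
    if len(original) != len(decoded):
        raise ValueError("blocks must contain the same number of pixels")
    return sum((a - b) ** 2 for a, b in zip(original, decoded))


example = sum_of_squared_differences([10, 20, 30], [12, 20, 27])
```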

In a following step, it is still necessary to establish whether an intra-coded image block was intra-coded for reasons of error robustness, in order to avoid the propagation of errors, or for reasons of coding efficiency. An intra-coding mode for reasons of coding efficiency particularly prevails if an object in the image sequence only appears temporarily. For an intra-coding mode chosen for reasons of coding efficiency, a reference image limitation is not desired. In order to determine the reasons for the intra-coding mode, a rate distortion optimization is again performed according to equation (3), but using the complete set O = {M_(I), O_(p)} and, as the distortion measure, the sum of the quadratic differences between the original image block and the image block received after decoding. The result of the optimization is designated as ô(k). Subsequently, an error robustness flag e_(k) is set, in which case e_(k) = δ_(o(k)≠ô(k)) and δ_(condition) is the Kronecker symbol that is 1 if the condition has been met and otherwise has the value 0. Therefore, the intra-coding mode was carried out for reasons of error robustness if the flag is set to 1.

Once all the image blocks of an image have been processed, the vector f is updated for all the entries f_(k) for which the error robustness flag e_(k) is set to 1. As a result, a reference image limitation is avoided for those intra-coding modes that were performed for reasons of coding efficiency, and thus the appearance and disappearance of objects can be coded efficiently with the aid of a number of reference images.
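The error robustness flag and the update of the vector f can be sketched as follows. The sketch assumes, beyond what the text states, that f_k counts how many images back the last robustness-driven intra-updating of block k took place (infinity if there was none), so that every entry ages by one image per coded image and flagged entries are reset to 1. The function names are hypothetical.

```python
import math
from typing import Hashable, List, Sequence


def robustness_flag(o: Hashable, o_hat: Hashable) -> int:
    """e_k: 1 if the decision o(k) differs from the efficiency-only ô(k)."""
    return int(o != o_hat)


def update_f(f: Sequence[float], flags: Sequence[int]) -> List[float]:
    """Update f after one image has been coded.

    Flagged blocks were intra-updated in the image just coded, which
    becomes reference image 1 for the next image; all other entries
    move one image further into the past (infinity stays infinity).
    """
    return [1 if e == 1 else age + 1 for age, e in zip(f, flags)]


flags = [
    robustness_flag("I16", "I16"),        # intra chosen for efficiency
    robustness_flag("I16", ("P16", 1)),   # intra chosen for robustness
    robustness_flag(("P16", 2), ("P16", 2)),
]
new_f = update_f([math.inf, 2, 4], flags)
```

Blocks that were intra-coded purely for coding efficiency leave their entry in f untouched apart from the aging, so older reference images remain available for them.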

Another exemplary embodiment is described below in which a channel-adaptive reference image is selected on the basis of rate distortion criteria. For this example, the distortion D(o) arising in the decoder has to be estimated. This distortion can be estimated by using any of the methods described in documents [5], [6] and [7] below. Another possibility of determining the distortion is the incorporation of the random channel behavior C when the distortion is estimated. After an image n has been transmitted, the channel behavior C is in this case given by means of a binary sequence from {0, 1}^(p(n)), in which case p(n) is the number of packets that are needed to transmit the images 1 to n. In this case, a 0 in the sequence designates a correctly received packet whereas a 1 indicates a lost packet. The random variable that describes the binary sequence up to image n is designated as C_(p(n)). The pixel distortion in the decoder depends on the pixel value reconstructed in the decoder, which is designated as ŝ_(i) and which is unknown to the encoder carrying out the coding. This reconstructed value depends on the channel behavior C and the selected coding mode o, i.e. ŝ_(i) = ŝ_(i)(C_(p(n)), o). The distortion is estimated as the sum of the expected values of the quadratic pixel distortions d_(i)(o) of all the macro blocks i, in which case it is assumed that the channel behavior C_(p(n)) is known to the encoder. The pixel distortion d_(i)(o) for the macro block i is as follows:

$$d_i(o) = E_{C_{p(n-1)}}\,\bigl|s_i - \hat{s}_i(C_{p(n-1)}, o)\bigr|^2 \qquad (4)$$

where E_(C_p(n−1)) represents the expected value of the quadratic difference between the original pixel value and the reconstructed pixel value, averaged over the channel C_(p(n−1)).

In order to calculate the expected value, the following method can be used. It is assumed that T copies of the random variable “channel behavior” are available in the encoder. These copies are designated as C_(p(n))(t), with t = 1, . . . , T. It is also assumed that all the random variables C_(p(n))(t) are independent and identically distributed. Therefore, according to the strong law of large numbers, the following holds for T → ∞:

$$\frac{1}{T}\sum_{t=1}^{T}\bigl|s_i - \hat{s}_i(C_{p(n)}(t), o)\bigr|^2 = E_{C_{p(n)}}\,\bigl|s_i - \hat{s}_i(C_{p(n)}, o)\bigr|^2 = d_i(o) \qquad (5)$$

Therefore, the expected value d_(i)(o) can be estimated with the expression on the left side and, in a next step, the expected distortion D_(i)(o) can be calculated. The reconstruction of the pixel values depends on the channel behavior C_(p(n−1))(t) as well as on the concealment in the decoder. By means of the last-mentioned formula, it is possible to estimate in the encoder the intensity of the distortion in the decoder.
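The averaging over T channel realizations in equation (5) can be sketched in Python. The sketch assumes independent packet losses with a fixed loss probability and a toy decoder that conceals a lost packet with a gray value of 128; the names `sample_channel`, `expected_distortion`, and `toy_decoder` are hypothetical and not taken from the text.

```python
import random
from typing import Callable, List, Sequence

Channel = List[int]  # 0 = packet received, 1 = packet lost


def sample_channel(num_packets: int, loss_prob: float, rng: random.Random) -> Channel:
    """Draw one realization of the channel behavior as a binary sequence."""
    return [1 if rng.random() < loss_prob else 0 for _ in range(num_packets)]


def expected_distortion(
    original: float,
    reconstruct: Callable[[Channel], float],
    channels: Sequence[Channel],
) -> float:
    """Average of |s_i - s_hat_i(C(t), o)|^2 over the given realizations."""
    total = sum((original - reconstruct(c)) ** 2 for c in channels)
    return total / len(channels)


def toy_decoder(channel: Channel) -> float:
    """Assumed decoder: gray concealment (128) if the last packet is lost."""
    return 128.0 if channel[-1] == 1 else 98.0


# Four hand-picked realizations, so that the result can be checked by hand.
fixed = [[0], [1], [0], [0]]
estimate = expected_distortion(100.0, toy_decoder, fixed)

# Randomly drawn realizations, as the encoder would use in practice.
rng = random.Random(0)
drawn = [sample_channel(3, 0.1, rng) for _ in range(500)]
mc_estimate = expected_distortion(100.0, toy_decoder, drawn)
```

For the four fixed realizations the squared errors are 4, 784, 4 and 4, which gives an estimate of 199. A larger T brings the average closer to the expected value, at the cost of running the simulated decoder T times in the encoder.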

In addition, although the invention is described in connection with digitized images, it should be readily apparent that the invention may be practiced with any type of still or moving digital image format. It is also understood that the process portions and segments described in the embodiments above can be substituted with equivalent processes to perform the disclosed methods and processes. Accordingly, the invention is not limited by the foregoing description or drawings, but is only limited by the scope of the appended claims.

REFERENCES

-   [1] G. Bjontegaard, T. Wiegand, “H.26L Test Model Long Term Number 8 (TML-8) draft 0”, ITU-T VCEG, Doc. VCEG-N10, September 2001.
-   [2] S. Wenger, “Common Conditions for the Internet/H.323 Case”, ITU-T VCEG (SG16/Q15), Doc. Q15-I-61, Ninth Meeting, Red Bank, N.J., October 1999.
-   [3] T. Stockhammer, T. Oelbaum, D. Marpe, and T. Wiegand, “H.26L Simulation Results for Common Conditions for H.323/Internet Case”, ITU-T VCEG (SG16/Q6), Doc. VCEG-N50, Fourteenth Meeting, Santa Barbara, Calif., September 2001.
-   [4] G. J. Sullivan and T. Wiegand, “Rate-Distortion Optimization for Video Compression”, IEEE Signal Processing Magazine, vol. 15, no. 6, pp. 74-90, November 1998.
-   [5] R. Zhang, S. L. Regunathan, and K. Rose, “Video Coding with Optimal Inter/Intra-Mode Switching for Packet Loss Resilience”, IEEE JSAC, vol. 18, no. 6, pp. 966-976.
-   [6] G. Cote, S. Shirani, F. Kossentini, “Optimal Mode Selection and Synchronization for Robust Video Communications over Error-Prone Networks”, IEEE JSAC, vol. 18, no. 6, pp. 952-965.
-   [7] T. Wiegand, N. Farber, K. Stuhlmüller, and B. Girod, “Error-Resilient Video Transmission Using Long-Term Memory Motion-Compensated Prediction”, IEEE JSAC, vol. 18, no. 6, pp. 1050-1062.

1-13. (canceled)
14. A method for coding a sequence of digitized images with a plurality of macro blocks in error-prone networks, said method comprising: coding the macro blocks to determine accessible reference images; coding a section of the macro blocks of the images in a section of the image in a first intra-coding mode depending on predetermined criteria; coding another section of the macro blocks of the image in a second intra-coding mode, or in an inter-coding mode, wherein movement vectors of the macro blocks are determined and wherein the number of accessible reference images selects a specified number of macro blocks; and limiting the selection from the number of accessible reference images in such a way that referencing takes place from image areas that were not subjected to the first intra-coding mode at a later stage.
15. The method according to claim 14, wherein the predetermined criteria for carrying out the coding in a first intra-coding mode are error robustness criteria with respect to an incorrect transmission of coded images.
16. The method according to claim 14, wherein the first intra-coding mode is executed at regular time intervals.
17. The method according to claim 14, wherein the first intra-coding mode is executed at random time intervals.
18. The method according to claim 14, wherein the step of limiting the selection from the number of accessible reference images further comprises the steps of: optimizing the detected movement vectors for each inter-coding mode and for each accessible reference image; determining a rate distortion movement compensation value for each of the optimized vectors; and selecting the detected movement vectors in accordance with a determined rate distortion movement compensation value.
19. The method according to claim 18, wherein the step of limiting the selection from the number of accessible reference images further comprises the step of creating a limited number of inter-coding mode combinations and reference images, wherein combinations that were coded in a later image in a first intra-coding mode are removed.
20. The method according to claim 19, wherein the step of limiting the selection from the number of accessible reference images further comprises the step of forming a best combination based on the rate distortion.
21. The method according to claim 19, wherein the rate distortion is determined by processing an error rate to be expected when the coded images are transmitted.
22. The method according to claim 20, wherein to determine the rate distortion criteria, the distortion of the pixel values contains the total of the quadratic differences between the pixel values before coding and the correspondingly decoded pixel values.
23. The method according to claim 20, wherein the distortion is estimated to determine the rate distortion criteria.